Automatic Variable Selection for High-Dimensional Linear Models with Longitudinal Data
Authors
Abstract
High-dimensional longitudinal data arise frequently in biomedical and genomic research. It is important to select relevant covariates when the dimension of the parameters diverges as the sample size increases. We consider the problem of variable selection in high-dimensional linear models with longitudinal data. A new variable selection procedure is proposed using the smooth-threshold generalized estimating equation and quadratic inference functions (SGEE-QIF) to incorporate correlation information. The proposed procedure automatically eliminates inactive predictors by setting the corresponding parameters to zero, and simultaneously estimates the nonzero regression coefficients by solving the SGEE-QIF. The proposed procedure avoids the convex optimization problem and is flexible and easy to implement. We establish the asymptotic properties in a high-dimensional framework where the number of covariates p_n increases as the number of clusters n increases. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed variable selection procedure.
Similar resources
Variable selection for generalized linear mixed models by L1-penalized estimation
Generalized linear mixed models are a widely used tool for modeling longitudinal data. However, their use is typically restricted to few covariates, because the presence of many predictors yields unstable estimates. The presented approach to the fitting of generalized linear mixed models includes an L1-penalty term that enforces variable selection and shrinkage simultaneously. A gradient asce...
Automatic Variable Selection for Single-Index Random Effects Models with Longitudinal Data
We consider the problem of variable selection for the single-index random effects models with longitudinal data. An automatic variable selection procedure is developed using smooth-threshold. The proposed method shares some of the desired features of existing variable selection methods: the resulting estimator enjoys the oracle property; the proposed procedure avoids the convex optimization pro...
A Comparative Review of Selection Models in Longitudinal Continuous Response Data with Dropout
Missing values occur in studies of various disciplines such as social sciences, medicine, and economics. The missing mechanism in these studies should be investigated more carefully. In this article, some models proposed in the literature on longitudinal data with dropout are reviewed and compared. In an applied example it is shown that the selection model of Hausman and Wise (1979, Econometri...
Statistical inference in high dimensional linear and AFT models
A large body of previous literature has proposed and studied variable selection procedures for high-dimensional data, and most researchers have focused on the selection properties as well as the point estimation properties. However, there have been limited studies considering the construction of confidence intervals for high-dimensional variable selection problems. In this thesis, we propose...
Comparison of Methods for Variable Selection in High-Dimensional Linear Mixed Models
Abstract. The analysis of high-dimensional data is currently a popular field of research, thanks to many applications, e.g. in genetics. At the same time, the type of problems that tend to arise in genetics can often be modeled using LMMs in conjunction with high-dimensional data. In this paper we introduce two new methods and briefly compare them to existing methods, which can be used for vari...
Journal title:
Volume, issue
Pages: -
Publication date: 2014